
Remove mmap backend #536

Merged: 4 commits into master, Mar 24, 2023
Conversation

@cberner (Owner) commented Mar 24, 2023

Fixes #458

I decided that the maintenance burden of supporting the mmap backend is too high for too small a performance improvement. Removing it makes redb provably safe and trivial to port to other platforms. Proving that the mmap backend was safe seemed infeasible, and its removal greatly simplifies the codebase.

cberner added 4 commits March 24, 2023 11:41
The userspace cache backend is now within a factor of about 1.5-2x of mmap, and I've given up on proving that the mmap backend is sound. It's just too complex, and a soundness bug was already found in 1293d4f.
@cberner cberner merged commit 619aa90 into master Mar 24, 2023
@cberner cberner deleted the mmap branch March 24, 2023 21:21
@Kerollmops (Contributor)

Hey @cberner 👋

I was wondering: does the removal of the mmap backend mean that values are always allocated when retrieved? If so, can keeping too many values for too long result in an OOM?

@cberner (Owner, Author) commented Aug 21, 2024

No, it should not. There's a cache which holds pages (redb's Page abstraction, not OS pages), and pages are evicted when the cache is full. The cache size is configurable.

@Kerollmops (Contributor) commented Aug 24, 2024

Thank you very much for the answer, @cberner. I have a couple of other questions:

  • What if I keep a lot of AccessGuards for a long time? I get them from the <Table as ReadableTable>::get method.
  • Will the pages be kept alive for as long as I hold the guards, and can that therefore blow up the memory?
  • Or will the page be retrieved lazily when I call AccessGuard::value?
  • And if it's the latter, could I keep the &[u8] it returns around for a long time? What happens then?

I am just thinking about this because I use LMDB and rely extensively on its memory-mapping behavior in arroy, keeping pointers to some values and reading them in parallel.

@cberner (Owner, Author) commented Aug 24, 2024

Yes, the pages will be kept for the lifetime of the AccessGuard. If the values are small, you could copy them out of the AccessGuard and retain them for a long time. Alternatively, you can keep the key around and retrieve the value from the Table on demand, which will read from the cache if the value is cached, and from disk if it has been evicted.

@Kerollmops (Contributor) commented Aug 24, 2024

> Alternatively, you can keep the key around and retrieve the value from the Table on demand, which will read from the cache if the value is cached, and from disk if it has been evicted.

Thank you, @cberner; that seems to be the best alternative. My last question: is there a way to read a not-yet-committed transaction in parallel?

That is, could I derive multiple read transactions from the current write transaction, spawn one thread per read transaction, and read the table in each (making sure no writes are performed while those read transactions live)? Something like this hypothetical API:

// Hypothetical API sketch: lock_for_read and read_txn do not exist in redb.
table.write(wtxn, "abc", "long embedding")?;
table.write(wtxn, "def", "another long embedding")?;

let rtxn_gen = wtxn.lock_for_read()?;
let rtxn0 = rtxn_gen.read_txn()?;
let rtxn1 = rtxn_gen.read_txn()?;

// not allowed while rtxn_gen lives
// table.write(wtxn, "def", "another long embedding")?;

assert_eq!(table.read(rtxn0, "abc")?, "long embedding");
assert_eq!(table.read(rtxn1, "def")?, "another long embedding");

rtxn0.abort();
rtxn1.abort();
drop(rtxn_gen);

table.write(wtxn, "ghi", "another super long embedding")?;

@cberner (Owner, Author) commented Aug 24, 2024

Read transactions can only read committed data, so you can't directly do that. The two options would be:

  1. Tables implement Send, so you can perform multithreaded reads of the Table during the write transaction. For an example, see Add test for multithreaded reads of a Table #848.
  2. Commit (you can use a Durability::None commit, if you're concerned about latency) and then read.

Successfully merging this pull request may close these issues: Audit and document all unsafe usage.